The oxygen content of the flue gas of a gas-fired boiler is used to monitor boiler combustion
efficiency. Conventionally, this oxygen content is measured using an oxygen sensor. However, because
it operates in extreme conditions, this sensor tends to have high maintenance costs. In addition,
there are no other sensors to provide redundancy, so when the sensor is damaged, workers must take
the measurement manually. This is dangerous for the workers, given the high-risk environmental
conditions. To overcome these problems, we propose a soft sensor based on an artificial neural
network (ANN) and random forest to predict the oxygen content. The prediction uses measured data
from the power plant's boiler, consisting of 19 process variables from a distributed control system.
The results show that the proposed soft sensor successfully predicts the oxygen content, and that
random forest performs better than ANN. The random forest prediction errors are mean absolute error
(MAE) of 0.0486, mean squared error (MSE) of 0.0052, root-mean-square error (RMSE) of 0.0718, and
Std Error of 0.0719. The errors using ANN are MAE of 0.0715, MSE of 0.0087, RMSE of 0.0935, and
Std Error of 0.0935.

Keywords: Prediction, Random forest, Soft sensor

1. INTRODUCTION

A steam boiler system is a closed vessel that uses fuel or electricity to generate steam to supply the 
power plant system [1]-[3]. The heat energy produced by the boiler comes from the combustion process in 
the combustion chamber, or furnace. In general, the combustion reaction in the boiler furnace requires 
three main components: fuel, air, and a flame from the igniter. 

The combustion efficiency in the boiler furnace describes the ability of a burner to burn all of the 
fuel entering the furnace. The ideal furnace combustion reaction occurs when the oxygen in the air is 
just sufficient to burn all of the fuel [4]-[6], so that no oxygen or energy remains in the flue gas. 
However, when fuel and oxygen are imperfectly mixed, the oxygen volume will be insufficient to burn all 
of the fuel. Therefore, an appropriate combination of the required amounts of energy and oxygen is 
necessary, and more fresh air is needed to burn all of the fuel. This additional air is known as excess 
air. In combustion with excess air, some oxygen remains in the flue gas. Therefore, the amount of 
unburned fuel and the oxygen remaining from the excess air in the flue gas can be used to estimate the 
combustion efficiency [7]. In addition to saving fuel, high combustion efficiency also reduces the air 
pollution generated by the combustion process [8], [9]. 

The optimal oxygen content depends on the type of furnace used. If the oxygen content is too low, 
unburned fuel will decrease air quality. On the other hand, if the oxygen content is too high, the furnace will 
be inefficient due to a large amount of energy lost through the flue gas. 

The oxygen content of the flue gas of a steam boiler system is conventionally measured by oxygen 
sensors, such as those based on zirconium oxide, a material capable of measuring oxygen levels in flue 
gas. However, zirconium oxide oxygen sensors tend to have high maintenance costs. In addition, there 
are no other sensors to provide redundancy, so when the sensor is damaged, workers must take the 
measurement manually using portable measuring devices. This is dangerous for the workers, given the 
high-risk environmental conditions. 

To overcome these problems, we propose a soft sensor built with two types of machine learning: an 
artificial neural network (ANN) [10]-[12] and random forest [13], [14]. A soft sensor is a 
software-based method that utilizes an intelligent system to solve a problem based on the input-output 
relationship [15]-[19]. Several researchers have applied machine learning for prediction and pattern 
recognition as the primary key of the soft sensor [20]-[25]. Other researchers proposed a soft sensor 
to indicate the oxygen content of flue gas using the support vector model and mixed model [26]. 

The rest of this paper is organized as follows. The research method is described in detail in 
section 2. Section 3 presents the experimental results, including the training and testing of the soft 
sensor using ANN and random forest. Finally, section 4 offers the conclusion of this study. 


2. RESEARCH METHOD 

In this paper, the proposed soft sensor to predict the oxygen content in the steam boiler flue gas is 
shown in Figure 1. This research was conducted in several stages: data collection, data preprocessing, soft 
sensor design using ANN and random forest, training, and performance evaluation of the soft sensor. The 
schematic diagram of a power plant’s boiler is shown in Figure 2 [27]. 

Figure 1. Diagram of the proposed soft sensor system to predict the oxygen content of the steam boiler flue 
gas using neural networks and random forest 


2.1. Data collection 

Data from the steam boiler system of a 32 MW power plant in an oil refinery unit in Indonesia were 
collected from 1 January until 28 August. The boiler used in this study is a water tube boiler that 
heats water into superheated steam. The steam is fed to the steam turbine generator, which supplies all 
power requirements for processing operations at the oil refinery unit. The collected data consist of 
19 parameters, including oxygen content, and were acquired from the historical data system of a 
distributed control system. Table 1 lists the process variables of the steam boiler used in the research. 


2.2. Data preprocessing 

After data collection, the data are preprocessed. The preprocessing consists of several steps: 
handling missing values, separating training data and test data, and data normalization. The 
preprocessed data are then fed to the ANN and random forest models. 
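
The three preprocessing steps can be sketched as follows. This is only an illustration under stated assumptions: the text does not say how missing values were handled, what split ratio was used, or which normalization was applied, so the sketch assumes that rows with missing values are dropped, that the split is a random 80/20 split, and that min-max scaling is fitted on the training data only. All function names and the toy data are hypothetical.

```python
import random

def drop_missing(rows):
    """Keep only rows in which every value is present (assumed strategy)."""
    return [row for row in rows if all(v is not None for v in row)]

def split_train_test(rows, test_fraction, seed):
    """Shuffle a copy of the rows and split it into (train, test)."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_min_max(rows):
    """Return per-column minima and maxima of the given rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def apply_min_max(rows, lows, highs):
    """Scale each column with the given minima and maxima (constant columns map to 0)."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

# Toy data with two columns; None marks a missing reading.
raw = [[1.0, 10.0], [2.0, None], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0], [6.0, 60.0]]
clean = drop_missing(raw)
train, test = split_train_test(clean, test_fraction=0.2, seed=0)
lows, highs = fit_min_max(train)          # statistics come from training data only
train_scaled = apply_min_max(train, lows, highs)
test_scaled = apply_min_max(test, lows, highs)
print(len(clean), len(train), len(test))
```

Fitting the scaling on the training rows and reusing it for the test rows keeps test information out of training; test values may therefore fall slightly outside the range 0 to 1.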

Figure 2. A schematic diagram of a steam boiler system [26] 


Table 1. Process variables of the steam boiler system 

No  Process variables                                       Unit 
1   Deaerator level                                         mm 
2   The feed flow rate of boiler water to the superheater   ton/hour 
3   The temperature of advanced steam in the superheater    °C 
4   Feedwater flow rate                                     kg/hour 
5   Main gas inlet flow rate to the furnace                 N.m³/hour 
6   Fuel gas pressure behind the control valve              kg/cm² 
7   Combustion air flow rate                                kg/hour 
8   Air pressure of burner box                              mmH2O 
9   Main steam temperature                                  °C 
10  Furnace exhaust gas pressure                            mmH2O 
11  The temperature of boiler flue gas                      °C 
12  Boiler steam pressure                                   kg/cm² 
13  Wind box pressure                                       mmH2O 
14  Combustion air temperature                              °C 
15  Steam drum boiler levels                                % 
16  Primary steam header flow rate                          kg/hour 
17  The water inlet temperature of the economizer           °C 
18  The water outlet temperature of the economizer          °C 
19  Oxygen content                                          % 


2.3. ANN soft sensor 

After preprocessing the data, we designed an ANN soft sensor. The ANN soft sensor was created using 
Python 3 with the NumPy, pandas, Matplotlib, and Keras libraries and the TensorFlow framework 
[28]-[30]. The hyperparameters considered in the research are the number of neurons, the number of 
epochs, feature selection, and the early stopping strategy [31], [32]. The number of neurons in the 
hidden layer was varied from 4 to 64. We used ReLU as the activation function in the hidden layer and 
mean squared error (MSE) as the loss function. The optimizer used for the learning process is 
stochastic gradient descent (SGD) [33]-[36]. After the ANN soft sensor was designed, it was trained for 
a set number of epochs and then tested by comparing its predictions with the target values. 
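
To make the architecture concrete, the following dependency-free sketch trains a network with one ReLU hidden layer, a linear output, squared-error loss, and per-sample SGD updates. It is not the Keras/TensorFlow model used in the study: the weight initialization, learning rate, network size, and toy data are assumptions chosen for illustration, early stopping is omitted, and all function names are hypothetical.

```python
import random

def init_network(n_inputs, n_hidden, rng):
    """Random weights for one hidden layer and a single linear output."""
    scale = (2.0 / n_inputs) ** 0.5  # assumed initialization scale
    return {
        "w1": [[rng.gauss(0.0, scale) for _ in range(n_inputs)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.gauss(0.0, (1.0 / n_hidden) ** 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }

def forward(net, x):
    """Return the hidden activations (after ReLU) and the scalar output."""
    hidden = [
        max(0.0, sum(w * xi for w, xi in zip(weights, x)) + b)
        for weights, b in zip(net["w1"], net["b1"])
    ]
    output = sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]
    return hidden, output

def sgd_step(net, x, y, lr):
    """One SGD update on a single sample; returns that sample's squared error."""
    hidden, output = forward(net, x)
    grad_out = 2.0 * (output - y)  # derivative of (output - y)^2 w.r.t. output
    for j, h in enumerate(hidden):
        # Gradient reaching hidden unit j; zero where ReLU is inactive.
        grad_h = grad_out * net["w2"][j] if h > 0.0 else 0.0
        net["w2"][j] -= lr * grad_out * h
        if grad_h != 0.0:
            for i, xi in enumerate(x):
                net["w1"][j][i] -= lr * grad_h * xi
            net["b1"][j] -= lr * grad_h
    net["b2"] -= lr * grad_out
    return (output - y) ** 2

def train(net, X, Y, epochs, lr, rng):
    """Train for a fixed number of epochs; returns the mean squared error per epoch."""
    order = list(range(len(X)))
    history = []
    for _ in range(epochs):
        rng.shuffle(order)
        total = sum(sgd_step(net, X[i], Y[i], lr) for i in order)
        history.append(total / len(X))
    return history

# Toy regression problem with two scaled inputs (the study uses 18 inputs).
rng = random.Random(0)
X = [[rng.random(), rng.random()] for _ in range(200)]
Y = [0.5 * a - 0.3 * b + 0.2 for a, b in X]
net = init_network(n_inputs=2, n_hidden=8, rng=rng)
history = train(net, X, Y, epochs=30, lr=0.05, rng=rng)
print(f"first-epoch MSE: {history[0]:.4f}, last-epoch MSE: {history[-1]:.4f}")
```

In this sketch, varying `n_hidden` corresponds to the neuron-count search described above.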


2.4. Random forest soft sensor 

Besides using ANN, we also design a soft sensor using a random forest for the oxygen content 
prediction of the flue gas. The performance of both models is then compared. We compare the mean absolute 
error (MAE), MSE, root-mean-square error (RMSE), and Std Error of both models. 
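
The text does not describe the random forest configuration or the library used, so the following is only a from-scratch sketch of the general technique: each regression tree is grown on a bootstrap sample, each split considers a random subset of the features, and the forest prediction is the average of the tree predictions. The number of trees, the depth limit, the feature subset size, and the toy data are assumptions, and all function names are hypothetical.

```python
import random

def mean(values):
    return sum(values) / len(values)

def sse(values):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not values:
        return 0.0
    m = mean(values)
    return sum((v - m) ** 2 for v in values)

def build_tree(X, y, max_depth, n_features, rng, min_samples=4):
    """Grow one regression tree; a leaf is a float, a node is a tuple."""
    if max_depth == 0 or len(y) < min_samples or len(set(y)) == 1:
        return mean(y)
    best = None  # (score, feature index, threshold)
    for f in rng.sample(range(len(X[0])), n_features):  # random feature subset
        for threshold in sorted(set(row[f] for row in X))[1:]:
            left = [t for row, t in zip(X, y) if row[f] < threshold]
            right = [t for row, t in zip(X, y) if row[f] >= threshold]
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, f, threshold)
    if best is None:  # every candidate feature was constant
        return mean(y)
    _, f, threshold = best
    left_idx = [i for i, row in enumerate(X) if row[f] < threshold]
    right_idx = [i for i, row in enumerate(X) if row[f] >= threshold]
    return (
        f,
        threshold,
        build_tree([X[i] for i in left_idx], [y[i] for i in left_idx],
                   max_depth - 1, n_features, rng, min_samples),
        build_tree([X[i] for i in right_idx], [y[i] for i in right_idx],
                   max_depth - 1, n_features, rng, min_samples),
    )

def predict_tree(tree, row):
    """Follow the splits until a leaf value is reached."""
    while isinstance(tree, tuple):
        f, threshold, left, right = tree
        tree = left if row[f] < threshold else right
    return tree

def fit_forest(X, y, n_trees, max_depth, n_features, seed):
    """Train each tree on a bootstrap sample drawn with replacement."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(X)) for _ in range(len(X))]
        forest.append(build_tree([X[i] for i in idx], [y[i] for i in idx],
                                 max_depth, n_features, rng))
    return forest

def predict_forest(forest, row):
    """Average the predictions of all trees."""
    return mean([predict_tree(tree, row) for tree in forest])

# Toy data: the target depends only on the first of three features.
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(3)] for _ in range(40)]
y = [2.0 if row[0] > 0.5 else 4.0 for row in X]
forest = fit_forest(X, y, n_trees=10, max_depth=3, n_features=2, seed=0)
pred_high_x0 = predict_forest(forest, [0.9, 0.5, 0.5])
pred_low_x0 = predict_forest(forest, [0.1, 0.5, 0.5])
print(pred_high_x0, pred_low_x0)
```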


3. RESULTS AND DISCUSSION 

After training for 1,000 epochs, the ANN soft sensor performance was evaluated using test data. The 
experiments show that the best ANN architecture in this research has 60 neurons in the hidden layer. 
Figure 3 shows the MAE of the ANN soft sensor with 60 neurons in the hidden layer. The MAE tends to 
decrease as the number of epochs increases. The validation error rises slightly at certain epochs, but 
it also generally decreases as training proceeds. 

The experimental results indicate that this ANN successfully predicts the oxygen content of the 
steam boiler flue gas. Figure 4 shows the histogram of the prediction error of ANN. The errors are 
concentrated around the smallest error values, which means that most of the predictions have a 
relatively small error. 

Figure 3. MAE of the ANN soft sensor with 60 neurons in the hidden layer 

Figure 4. Histogram of the prediction error of the soft sensor with 60 neurons in the hidden layer 


In this study, experiments were also conducted on an oxygen content prediction system designed 
using a random forest. Figure 5 compares the prediction errors of this system with those of the oxygen 
content prediction system using ANN. The results show that the random forest outperforms the ANN. The 
random forest prediction errors are MAE of 0.0486, MSE of 0.0052, RMSE of 0.0718, and Std Error of 
0.0719. The errors using ANN are MAE of 0.0715, MSE of 0.0087, RMSE of 0.0935, and Std Error of 
0.0935. The model performance can also be investigated through the relationship between the predicted 
and measured oxygen content of the flue gas. This relationship for the random forest soft sensor is 
shown in Figure 6. The predicted and measured values lie close to a straight line. 
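
The four error measures reported above can be computed from the prediction errors as in the following sketch. The text does not define "Std Error", so the sketch assumes it is the sample standard deviation of the errors; the function names and the toy values are hypothetical. The last lines check that the reported RMSE values agree with the square roots of the reported MSE values up to rounding.

```python
import math

def mae(errors):
    """Mean absolute error."""
    return sum(abs(e) for e in errors) / len(errors)

def mse(errors):
    """Mean squared error."""
    return sum(e * e for e in errors) / len(errors)

def rmse(errors):
    """Root-mean-square error, the square root of the MSE."""
    return math.sqrt(mse(errors))

def std_error(errors):
    """Sample standard deviation of the errors (assumed meaning of 'Std Error')."""
    m = sum(errors) / len(errors)
    return math.sqrt(sum((e - m) ** 2 for e in errors) / (len(errors) - 1))

# Toy example: measured and predicted oxygen content in percent.
measured = [3.0, 3.2, 2.9, 3.1]
predicted = [3.1, 3.1, 3.0, 3.0]
errors = [p - m for p, m in zip(predicted, measured)]
print(mae(errors), mse(errors), rmse(errors), std_error(errors))

# Consistency check on the values reported in the text.
reported = {
    "random forest": {"MSE": 0.0052, "RMSE": 0.0718},
    "ANN": {"MSE": 0.0087, "RMSE": 0.0935},
}
gaps = {name: abs(math.sqrt(m["MSE"]) - m["RMSE"]) for name, m in reported.items()}
print(gaps)
```

When the mean error is close to zero, the sample standard deviation is nearly equal to the RMSE, which matches the near-identical RMSE and Std Error values reported for both models.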

Figure 5. Comparison of the prediction errors of the soft sensors using artificial neural networks 
and random forest 

Figure 6. The predicted and measured oxygen content of flue gas using a random forest soft sensor system 


4. CONCLUSIONS 

In this paper, we proposed a soft sensor system to predict the oxygen content of the steam boiler flue 
gas using ANN and random forest. The experimental results show that this soft sensor system 
successfully predicts the oxygen content in the flue gas of the steam boiler. The random forest system 
performed better than the ANN system. The random forest prediction errors are MAE of 0.0486, MSE of 
0.0052, RMSE of 0.0718, and Std Error of 0.0719, while the ANN errors are MAE of 0.0715, MSE of 0.0087, 
RMSE of 0.0935, and Std Error of 0.0935. 